cost-sensitive classification
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper provides a theoretical analysis of learning from positive and unlabeled (PU) data, the setting in which only positive and unlabeled instances are available. The authors show that learning from PU data is equivalent to a cost-sensitive classification task when the class prior is known. The main contribution is to show that using any convex loss leads to an inconsistent classifier in the PU setting, whereas the non-convex ramp loss yields a consistent estimator. This theoretical justification is supported by experiments demonstrating that the hinge loss of the SVM can yield much worse classification error than the non-convex ramp loss.
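To make the equivalence concrete, the risk rewriting behind this reduction can be sketched in standard PU notation, with known class prior \pi = p(y=+1); this is a reconstruction from the review's summary, not an excerpt from the paper.

```latex
% The unlabeled marginal mixes the two classes:
\[
  p(x) = \pi\, p(x \mid y{=}{+}1) + (1-\pi)\, p(x \mid y{=}{-}1),
\]
% so the negative-class error is expressible without negative labels:
\[
  (1-\pi)\,\mathbb{E}_{-}\!\bigl[\mathbf{1}\{f(x)={+}1\}\bigr]
  = \mathbb{E}_{p(x)}\!\bigl[\mathbf{1}\{f(x)={+}1\}\bigr]
  - \pi\,\mathbb{E}_{+}\!\bigl[\mathbf{1}\{f(x)={+}1\}\bigr],
\]
% and the misclassification risk becomes a cost-sensitive objective over
% positives and unlabeled data only:
\[
  R(f) = 2\pi\,\mathbb{E}_{+}\!\bigl[\mathbf{1}\{f(x)={-}1\}\bigr]
       + \mathbb{E}_{p(x)}\!\bigl[\mathbf{1}\{f(x)={+}1\}\bigr] - \pi.
\]
```

Roughly, replacing the indicator with a surrogate loss \ell preserves this identity only when \ell(z) + \ell(-z) is constant, a symmetry the ramp loss satisfies; convex surrogates such as the hinge loss violate it and pick up a superfluous penalty term, which is the source of the inconsistency the review describes.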
Optimizing F-Measures by Cost-Sensitive Classification
Shameem Puthiya Parambath, Nicolas Usunier, Yves Grandvalet
We present a theoretical analysis of F-measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F-measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F-measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F-measure optimization tasks.
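As one concrete reading of this reduction, the following is a minimal sketch, not the authors' implementation: it sweeps a small grid of cost asymmetries, trains a cost-sensitive linear classifier for each via scikit-learn's class_weight, and keeps the classifier with the best F1 on a held-out split. The grid size, the 70/30 split, and the choice of logistic regression are all assumptions.

```python
# Sketch: F-measure maximization reduced to a sweep of cost-sensitive
# classification problems, with the unknown cost chosen by validation F1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def f_measure_by_cost_sweep(X, y, n_costs=19, seed=0):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    best_clf, best_f1, best_a = None, -1.0, None
    # Each asymmetry a in (0, 1) trades off the two error costs:
    # false negatives get weight a, false positives weight 1 - a.
    for a in np.linspace(0.05, 0.95, n_costs):
        clf = LogisticRegression(class_weight={1: a, 0: 1.0 - a},
                                 max_iter=1000)
        clf.fit(X_tr, y_tr)
        f1 = f1_score(y_val, clf.predict(X_val))
        if f1 > best_f1:
            best_clf, best_f1, best_a = clf, f1, a
    return best_clf, best_f1, best_a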
Analyzing Cost-Sensitive Surrogate Losses via $\mathcal{H}$-calibration
Sanket Shah, Milind Tambe, Jessie Finocchiaro
This paper aims to understand whether machine learning models should be trained using cost-sensitive surrogates or cost-agnostic ones (e.g., cross-entropy). Analyzing this question through the lens of $\mathcal{H}$-calibration, we find that cost-sensitive surrogates can strictly outperform their cost-agnostic counterparts when learning small models under common distributional assumptions. Since these distributional assumptions are hard to verify in practice, we also show that cost-sensitive surrogates consistently outperform cost-agnostic surrogates on classification datasets from the UCI repository. Together, these results make a strong case for using cost-sensitive surrogates in practice.
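One common form of cost-sensitive surrogate in this vein is a cost-weighted cross-entropy. The sketch below contrasts it with the cost-agnostic version; the weighting scheme and the costs c_fn=5.0, c_fp=1.0 are illustrative assumptions, not the surrogate family analyzed in the paper.

```python
# Sketch: cost-agnostic vs. cost-sensitive cross-entropy for binary labels.
import numpy as np

def cross_entropy(p, y):
    # Cost-agnostic: both error types are penalized identically.
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def cost_sensitive_cross_entropy(p, y, c_fn=5.0, c_fp=1.0):
    # Cost-sensitive: a missed positive (false negative) costs c_fn,
    # a false alarm (false positive) costs c_fp.
    return -(c_fn * y * np.log(p) + c_fp * (1.0 - y) * np.log(1.0 - p))

p = np.array([0.2, 0.8])  # predicted P(y = 1)
y = np.array([1.0, 0.0])  # ground-truth labels
print(cross_entropy(p, y))                 # symmetric penalties
print(cost_sensitive_cross_entropy(p, y))  # the missed positive dominates
```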